Investigating Simple Object Representations in Model-Free Deep Reinforcement Learning
We explore the benefits of augmenting state-of-the-art model-free deep
reinforcement learning algorithms with simple object representations.
Following the Frostbite challenge posed by Lake et al. (2017), we identify
object representations as a critical cognitive capacity lacking from current
reinforcement learning agents. We discover that providing the Rainbow model
(Hessel et al., 2018) with simple, feature-engineered object representations
substantially boosts its performance on the Frostbite game from Atari 2600. We
then analyze the relative contributions of the representations of different
types of objects, identify environment states where these representations are
most impactful, and examine how these representations aid in generalizing to
novel situations.
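A minimal sketch of the general idea, assuming illustrative layer sizes and a feature vector produced by some hand-engineered object extractor (this is not the paper's implementation): the object features are simply concatenated with the convolutional features of a standard Atari-style value network before the output head.

```python
import torch
import torch.nn as nn

class ObjectAugmentedQNet(nn.Module):
    """Hypothetical value network whose input pixels are augmented
    with a vector of hand-engineered object features."""

    def __init__(self, n_actions: int, n_object_features: int = 32):
        super().__init__()
        # Standard Atari-style convolutional torso over stacked pixel frames.
        self.conv = nn.Sequential(
            nn.Conv2d(4, 32, 8, stride=4), nn.ReLU(),
            nn.Conv2d(32, 64, 4, stride=2), nn.ReLU(),
            nn.Conv2d(64, 64, 3, stride=1), nn.ReLU(),
            nn.Flatten(),
        )
        conv_out = 64 * 7 * 7  # for 84x84 input frames
        # Object features (e.g., positions of the player, ice floes, hazards)
        # are concatenated with the visual features before the output head.
        self.head = nn.Sequential(
            nn.Linear(conv_out + n_object_features, 512), nn.ReLU(),
            nn.Linear(512, n_actions),
        )

    def forward(self, frames: torch.Tensor, object_features: torch.Tensor):
        return self.head(torch.cat([self.conv(frames), object_features], dim=1))
```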
Learning a smooth kernel regularizer for convolutional neural networks
Modern deep neural networks require a tremendous amount of data to train,
often needing hundreds or thousands of labeled examples to learn an effective
representation. For these networks to work with less data, more structure must
be built into their architectures or learned from previous experience. The
learned weights of convolutional neural networks (CNNs) trained on large
datasets for object recognition contain a substantial amount of structure.
These representations have parallels to simple cells in the primary visual
cortex, where receptive fields are smooth and contain many regularities.
Incorporating smoothness constraints over the kernel weights of modern CNN
architectures is a promising way to reduce their sample complexity. We propose
a smooth kernel regularizer that encourages spatial correlations in convolution
kernel weights. The correlation parameters of this regularizer are learned from
previous experience, yielding a method with a hierarchical Bayesian
interpretation. We show that our correlated regularizer can help constrain
models for visual recognition, improving over an L2 regularization baseline.
Comment: Submitted to CogSci 2019
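As a rough illustration of the kind of penalty described, here is a sketch assuming a per-kernel Gaussian penalty of the form lam * w^T Sigma^{-1} w, with Sigma estimated from the flattened kernels of a previously trained network. The estimator, the ridge term eps, and the strength lam are illustrative assumptions, not the paper's exact method.

```python
import torch

def fit_kernel_covariance(kernels: torch.Tensor, eps: float = 1e-3) -> torch.Tensor:
    """Estimate a (k*k, k*k) covariance over kernel pixels from past experience.

    `kernels` holds flattened k x k slices from a trained CNN, one per row.
    """
    x = kernels - kernels.mean(dim=0, keepdim=True)
    cov = x.T @ x / (x.shape[0] - 1)
    return cov + eps * torch.eye(cov.shape[0])  # ridge term for invertibility

def smooth_kernel_penalty(weight: torch.Tensor, sigma_inv: torch.Tensor,
                          lam: float = 1e-2) -> torch.Tensor:
    """Gaussian-prior-flavored penalty: lam * sum_i w_i^T Sigma^{-1} w_i,
    applied to every k x k slice of a conv layer's weight tensor."""
    w = weight.reshape(-1, weight.shape[-2] * weight.shape[-1])
    return lam * torch.einsum('ni,ij,nj->', w, sigma_inv, w)
```

In training, the penalty would simply be added to the task loss, e.g. loss = task_loss + smooth_kernel_penalty(layer.weight, sigma_inv).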
Learning Inductive Biases with Simple Neural Networks
People use rich prior knowledge about the world in order to efficiently learn
new concepts. These priors - also known as "inductive biases" - pertain to the
space of internal models considered by a learner, and they help the learner
make inferences that go beyond the observed data. A recent study found that
deep neural networks optimized for object recognition develop the shape bias
(Ritter et al., 2017), an inductive bias possessed by children that plays an
important role in early word learning. However, these networks use
unrealistically large quantities of training data, and the conditions required
for these biases to develop are not well understood. Moreover, it is unclear
how the learning dynamics of these networks relate to developmental processes
in childhood. We investigate the development and influence of the shape bias in
neural networks using controlled datasets of abstract patterns and synthetic
images, allowing us to systematically vary the quantity and form of the
experience provided to the learning algorithms. We find that simple neural
networks develop a shape bias after seeing as few as 3 examples of 4 object
categories. The development of these biases predicts the onset of vocabulary
acceleration in our networks, consistent with the developmental process in
children.
Comment: Published in Proceedings of the 40th Annual Meeting of the Cognitive Science Society, July 2018
Generating new concepts with hybrid neuro-symbolic models
Human conceptual knowledge supports the ability to generate novel yet highly
structured concepts, and the form of this conceptual knowledge is of great
interest to cognitive scientists. One tradition has emphasized structured
knowledge, viewing concepts as embedded in intuitive theories or organized in
complex symbolic knowledge structures. A second tradition has emphasized
statistical knowledge, viewing conceptual knowledge as emerging from the
rich correlational structure captured by training neural networks and other
statistical models. In this paper, we explore a synthesis of these two
traditions through a novel neuro-symbolic model for generating new concepts.
Using simple visual concepts as a testbed, we bring together neural networks
and symbolic probabilistic programs to learn a generative model of novel
handwritten characters. Two alternative models with more generic neural
network architectures are also explored. We compare each of these three models for their
likelihoods on held-out character classes and for the quality of their
productions, finding that our hybrid model learns the most convincing
representation and generalizes further from the training observations.
Comment: Published in Proceedings of the 42nd Annual Meeting of the Cognitive Science Society, July 2020
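To make the division of labor concrete, here is a heavily simplified, hypothetical sketch of the neuro-symbolic recipe: a symbolic probabilistic program samples a structured character "type" (a stroke count plus control points), and a neural renderer maps that symbolic description to pixels. All distributions, sizes, and the renderer architecture are illustrative assumptions, not the published model.

```python
import torch
import torch.nn as nn

def sample_character_type(max_strokes: int = 4, points_per_stroke: int = 5):
    """Symbolic stage: a discrete stroke count plus continuous control points."""
    n_strokes = torch.randint(1, max_strokes + 1, ()).item()
    strokes = torch.rand(n_strokes, points_per_stroke, 2)  # (x, y) in [0, 1]
    return strokes

class NeuralRenderer(nn.Module):
    """Neural stage: map a (padded) symbolic description to a 28x28 image."""

    def __init__(self, max_strokes: int = 4, points_per_stroke: int = 5):
        super().__init__()
        self.max_strokes = max_strokes
        in_dim = max_strokes * points_per_stroke * 2
        self.net = nn.Sequential(
            nn.Linear(in_dim, 256), nn.ReLU(),
            nn.Linear(256, 28 * 28), nn.Sigmoid(),
        )

    def forward(self, strokes: torch.Tensor) -> torch.Tensor:
        # Pad the variable-length stroke list to a fixed-size input.
        padded = torch.zeros(self.max_strokes, *strokes.shape[1:])
        padded[: strokes.shape[0]] = strokes
        return self.net(padded.reshape(1, -1)).reshape(28, 28)

# Generating a novel concept = sample a symbolic type, then render it.
renderer = NeuralRenderer()
image = renderer(sample_character_type())
```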